Introduction

In anticipation of the Dallas area’s next large-scale comprehensive planning process, the North Central Texas Council of Governments (NCTCOG) invited us to forecast urban development in 2029. We construct a binary logistic regression model that predicts development change from land cover, land use, population, and existing development data.

Data and Analysis

We use land cover data from 2008 and 2019 to make a land cover development change map (Figure 1).

Figure 1. Land Cover Development Change

Figure 2. Land Cover Types, 2008

Figure 2 shows a series of maps of land cover types, including developed, farm, forest, other undeveloped, water, and wetlands.

Population and population change are critical components of predicting demand. We downloaded census data for 2009 and 2019 at the census tract level using the tidycensus package. Figure 3 shows apparent population growth in the “sprawl belt.”

We downloaded highway data from the Texas Department of Transportation and measured the distance from each grid cell to its nearest highway segment, as shown in Figures 4 and 5.
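The distance measurement described above can be sketched in a few lines. This is only an illustration under stated assumptions: it treats grid cell centroids and highway segments as planar coordinates in a projected coordinate system, and the function names and sample coordinates are hypothetical rather than taken from the analysis.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the line segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to the endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_nearest_highway(centroids, segments):
    """For each grid cell centroid, the distance to the closest highway segment."""
    return [
        min(point_segment_distance(c, a, b) for a, b in segments)
        for c in centroids
    ]

# Hypothetical highway segments and grid cell centroids (same planar units).
highways = [((0.0, 0.0), (10.0, 0.0)), ((10.0, 0.0), (10.0, 10.0))]
cells = [(5.0, 3.0), (12.0, 5.0), (-3.0, -4.0)]
distances = distance_to_nearest_highway(cells, highways)
print(distances)
```

For a full study area, a spatial index would likely be needed to avoid comparing every cell with every segment.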

At the center of our model is a hypothesis that development demand must, in part, be a function of the pattern of existing development. Development occurs when the market believes a higher and better use may bring an investment return. In the case of a sprawling region like Dallas, assuming the requisite demand, there is a clear return on investment for converting farmland to suburban housing.

Figure 3. The population of Dallas and surrounding areas in 2009 and 2019

Figure 4. New Development and Highways of Dallas and surrounding areas

Figure 5. Distance to Highways

While urban land is valuable, contemporary urbanism in regions like Dallas shows us that suburban locations can be quite desirable as well. Hinterland locations do not offer direct access to jobs and cultural amenities. Instead, residents trade off accessibility for larger lots and bigger homes, as well as a bundle of public services like school quality. Developers are attracted to suburban and exurban locations because of cheap land on ‘greenfield’ sites like farms and open space.

In Dallas, as in many sprawling regions of the U.S., the economic incentives that underlie sprawl likely encourage both the accessibility and leapfrog models of development. For our purposes, however, features must be created to associate these patterns with development. Without them, the model may lack the appropriate spatial experience on which to forecast growth.

To keep it simple, we develop features associated with accessibility-based patterns. In reality, the analyst should develop a series of applicable features and test which best associate with the outcome of interest. The problem becomes considerably more difficult when one realizes that sprawl patterns may differ throughout the study area, for instance if land use restrictions vary by county. Below we estimate models using logistic regression, but more flexible machine learning algorithms, most notably Random Forest, are more adept at dealing with non-linearities across space.

Accessibility is measured by way of a spatial lag hypothesizing that new development is a function of distance to existing development. The shorter the distance, the more accessible a grid cell is to existing development.
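One common way to build such a spatial lag is to average the distance from each grid cell to its k nearest developed cells. The sketch below assumes that construction; the choice of k = 2, the function name, and the sample coordinates are illustrative assumptions, not details confirmed by the analysis.

```python
import math

def spatial_lag(cells, developed, k=2):
    """Mean distance from each cell to its k nearest developed cells.

    Smaller values mean the cell is more accessible to existing development.
    """
    if not developed:
        raise ValueError("at least one developed cell is required")
    lags = []
    for cx, cy in cells:
        dists = sorted(math.hypot(cx - dx, cy - dy) for dx, dy in developed)
        nearest = dists[:k]
        lags.append(sum(nearest) / len(nearest))
    return lags

# Hypothetical centroids of cells developed in the baseline year.
developed_2009 = [(0.0, 0.0), (0.0, 2.0), (6.0, 8.0)]
# Hypothetical centroids of the cells being scored.
grid_cells = [(0.0, 1.0), (3.0, 4.0)]
lag = spatial_lag(grid_cells, developed_2009, k=2)
print(lag)
```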

Figure 6. Spatial Lag to 2009 Development

In this section we explore the extent to which each feature is associated with development change. The bar plots below indicate that new development is more likely to appear near highways and existing development. As expected, the accessibility features are positively associated with urban development.

Figure 8 indicates that new development is more likely to appear where population is denser and population change is greater.

Figure 7. New Development as a Function of the Continuous Variables

Figure 8. New Development as a Function of Factor Variables

Model

Using these data, we estimate six logistic regression models to predict development change between 2009 and 2019, each more sophisticated than the last. The data were split evenly into training and test sets (50% each), and the models are estimated on the training set.
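As a rough illustration of the estimation step, the sketch below makes a 50/50 split and fits a binary logistic regression by gradient ascent on the log-likelihood. It is not the models reported here: the data are synthetic, the single feature (distance to highway) and its true coefficients are assumptions made up for the example, and in practice a statistics library would normally do the fitting.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(beta, X):
    """Predicted probability of development; beta[0] is the intercept."""
    return [sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi))) for xi in X]

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit coefficients by batch gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(epochs):
        probs = predict_proba(beta, X)
        grad = [0.0] * (p + 1)
        for xi, yi, pi in zip(X, y, probs):
            err = yi - pi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Synthetic grid cells: development is made more likely near highways.
random.seed(0)
rows = []
for _ in range(400):
    dist = random.uniform(0.0, 5.0)
    p_true = sigmoid(2.0 - 1.5 * dist)  # assumed relationship, for illustration
    rows.append(([dist], 1 if random.random() < p_true else 0))

# 50% training / 50% test split.
random.shuffle(rows)
half = len(rows) // 2
train_rows, test_rows = rows[:half], rows[half:]
X_train = [r[0] for r in train_rows]
y_train = [r[1] for r in train_rows]
X_test = [r[0] for r in test_rows]
y_test = [r[1] for r in test_rows]

beta = fit_logistic(X_train, y_train)
test_probs = predict_proba(beta, X_test)
print("intercept, distance coefficient:", beta)
```

With data generated this way, the fitted distance coefficient should come out negative, consistent with development being more likely near highways.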

Figure 9 shows the McFadden pseudo R-squared associated with each model. Model 6 was selected for the 2029 prediction.

Figure 9. McFadden R-Squared by Model

Figure 10. Development Predictions – Low Threshold
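McFadden's pseudo R-squared compares the log-likelihood of the fitted model with that of an intercept-only model: 1 - lnL(model) / lnL(null). A minimal sketch follows, using hypothetical observed outcomes and fitted probabilities rather than values from any of the six models.

```python
import math

def log_likelihood(y, probs, eps=1e-12):
    """Bernoulli log-likelihood of outcomes y given predicted probabilities."""
    return sum(
        yi * math.log(max(p, eps)) + (1 - yi) * math.log(max(1.0 - p, eps))
        for yi, p in zip(y, probs)
    )

def mcfadden_r2(y, probs):
    """1 - lnL(model) / lnL(null); the null model predicts the base rate.

    Assumes y contains both 0s and 1s, otherwise lnL(null) is zero.
    """
    base_rate = sum(y) / len(y)
    ll_model = log_likelihood(y, probs)
    ll_null = log_likelihood(y, [base_rate] * len(y))
    return 1.0 - ll_model / ll_null

# Hypothetical outcomes (1 = newly developed) and fitted probabilities.
observed = [1, 0, 0, 1, 0, 0, 0, 1]
fitted = [0.8, 0.2, 0.1, 0.7, 0.3, 0.2, 0.1, 0.6]
r2 = mcfadden_r2(observed, fitted)
print(round(r2, 3))
```

A model that predicts no better than the base rate scores 0, and values closer to 1 indicate a better fit.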

Figure 10 shows the true positives (which determine sensitivity) and true negatives (which determine specificity) for each grid cell by threshold type.
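To make the threshold trade-off concrete, the sketch below computes sensitivity and specificity at two cutoffs. The cutoffs (0.05 and 0.50) and the sample probabilities are hypothetical and are not the thresholds used in the analysis.

```python
def sensitivity_specificity(y_true, probs, threshold):
    """Classify each cell as developed when prob >= threshold, then score."""
    tp = fn = tn = fp = 0
    for yi, p in zip(y_true, probs):
        predicted = 1 if p >= threshold else 0
        if yi == 1 and predicted == 1:
            tp += 1
        elif yi == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Hypothetical observed change (1 = developed) and predicted probabilities.
observed = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.60, 0.30, 0.08, 0.40, 0.20, 0.07, 0.03, 0.01]

low = sensitivity_specificity(observed, probs, 0.05)
high = sensitivity_specificity(observed, probs, 0.50)
print("low threshold:", low)
print("high threshold:", high)
```

In this example the low threshold catches every developed cell but misclassifies most undeveloped ones, while the high threshold does the reverse.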

We are concerned about whether our model generalizes to each county in the study area despite possible differences in land use or land use planning. Thus, we examine goodness-of-fit metrics under spatial cross-validation rather than random cross-validation. For the most part, the confusion matrix metrics suggest the model generalizes to those counties that underwent significant development change.
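One way to set up spatial cross-validation is to hold out one county at a time, train on the remaining counties, and score the held-out county. The sketch below only generates the folds and assumes that leave-one-county-out design; the function name and the cell-to-county labels are hypothetical.

```python
def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx) for each distinct group label."""
    for held_out in sorted(set(groups)):
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train_idx, test_idx

# Hypothetical county label for each grid cell.
county_of_cell = ["Dallas", "Dallas", "Collin", "Denton", "Collin", "Tarrant"]

folds = list(leave_one_group_out(county_of_cell))
for county, train_idx, test_idx in folds:
    # A model would be fit on train_idx and evaluated on test_idx here.
    print(county, "train:", train_idx, "test:", test_idx)
```

Because each test fold is a whole county, a model that only works where land use rules resemble the training counties should show weaker metrics on the held-out county.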

Prediction for 2029

Demand-side

At this point, a simple but useful model has been trained to predict urban development between 2009 and 2019 as a function of baseline features from 2009, including land cover, built environment, and population. We then update the features to reflect a 2019 baseline, so that predictions from the model are for 2029.

Figure 12. Population Change by County

First, we update the population change and spatial lag of development features in our model. Note that in this demand-change scenario, the 2029 population was projected for each county and then distributed to grid cells in proportion to each cell’s existing population. The spatial lag of development feature describes how predicted new development relates in space to old development. Finally, Model 6 is used to predict for 2029 given the updated population change and lag development features.
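The distribution of county projections to grid cells can be sketched as a proportional allocation: each cell receives a share of its county's projected total equal to its share of the county's current population. The figures below are hypothetical and the function name is an assumption for illustration.

```python
def allocate_population(cell_pop, cell_county, county_projection):
    """Split each county's projected total across its cells by current share."""
    totals = {}
    for pop, county in zip(cell_pop, cell_county):
        totals[county] = totals.get(county, 0.0) + pop
    allocated = []
    for pop, county in zip(cell_pop, cell_county):
        # Cells in a county with no current population receive nothing.
        share = pop / totals[county] if totals[county] > 0 else 0.0
        allocated.append(county_projection[county] * share)
    return allocated

# Hypothetical 2019 cell populations and 2029 county projections.
cell_pop_2019 = [100.0, 300.0, 50.0, 150.0]
cell_county = ["Collin", "Collin", "Denton", "Denton"]
projection_2029 = {"Collin": 480.0, "Denton": 230.0}

cell_pop_2029 = allocate_population(cell_pop_2019, cell_county, projection_2029)
pop_change = [new - old for new, old in zip(cell_pop_2029, cell_pop_2019)]
print(cell_pop_2029)
print(pop_change)
```

One consequence of this design is that cells with no existing population receive no projected growth, so growth in currently empty areas enters the model only through the other features.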

The map of predicted probabilities that results is best thought of as a measure of predicted development demand in 2029.

Figure 13. Development Demand in 2029: Predicted Probabilities

Figure 14. Land Cover Types, 2019

Environmental sensitivity

The amount of sensitive land lost gives a sense of how development has affected the natural environment in the recent past. The environmentally sensitive indicators include the total area of wetland and forest land cover in 2019.

Figure 15. Sensitive land lost: 2009-2019

Figure 16. Sensitive regions

Supply-side

We propose forming a peripheral transportation network by creating a new light rail system connecting Tarrant, Denton, Collin, and Dallas Counties. Promoting development in the areas surrounding the Denton and Collin light rail lines would improve access to transportation infrastructure and attract more incoming residents.

Figure 17. Mapping 2029

Given current population trends in the Dallas area, the 2029 planning goal for Dallas and Tarrant Counties is primarily dense residential development.

Population growth will be concentrated in the areas proposed for development. By 2029, development will be better connected by express rail transit, with the Dallas-Fort Worth metroplex as the centerpiece driving development north into Denton and Collin.

Figure 18. Population Change by County: 2019 - 2029

County population changes before and after the proposed network are shown in Figure 18. In comparison, the populations of Dallas, Tarrant, Denton, and Collin all increase by more than 15%. However, because the new transportation network drives development in Denton and Collin, population growth in those two counties increases by 3% relative to the scenario without the network, while Dallas and Tarrant see a 1% decrease in population growth. The post-planning population by county is shown in Figure 19.

Figure 19. Population by County: 2029

From a sustainability and ecological point of view, the population of Dallas is relatively saturated under the sprawl development pattern, and the main development opportunities are in Denton and Collin, so laying out infrastructure there early is strategic for regional development.

Ellis, Kaufman, and Rockwall contain large amounts of private land and farmland that are not well suited to real estate development or to housing Dallas’ working population, so development there is not recommended.

Figure 20. County Specific Allocation Metrics

This plot provides both supply and demand-side analytics by county. The plot gives a sense of development demand (Demand-Side), suitable land for development (Suitable), and sensitive land (Not Suitable).

In Collin and Denton Counties, north of Dallas, the data suggest that both population and development demand will increase. At the same time, these counties have a high share of developable farmland and undeveloped land and a low supply of sensitive land. Collin and Denton are well suited to new development.

Allocation

Allocation is the final stage of the urban growth modeling process. Now that both demand and supply are understood, planners can allocate development rights accordingly. Of course, this could take many forms of regulation, including zoning, subdivision approval, or outright conservation. This section visualizes demand and supply for two counties, Collin and Denton. The data suggest that they are more conducive to growth than Kaufman, Ellis, and Rockwall Counties.

Figures 21 and 22. Development Potential, Projected Population, 2029: Collin

First, development demand was predicted for Collin and Denton. Then a layer was created with indicators for both previously developed land and environmentally unsuitable land. This layer was then overlaid atop development demand and projected population change to give the full supply- and demand-side picture for Collin and Denton.

There are some clear development opportunities. Significant infill opportunities exist along the southwest boundary of Collin and the southeast boundary of Denton, where population change was projected to be greatest. There are also many environmentally suitable lands near the waterbody in Denton and in the west-central part of Collin around a highway interchange. These would be ideal spaces for large-footprint suburban shopping.

Figures 23 and 24. Development Potential, Projected Population, 2029: Denton

Actually allocating land to development requires a more nuanced understanding of how local land use laws might play a role. Together, the model, predictions, and allocation analysis above can inform future development decisions for Dallas and the surrounding counties.